Computer vision

Agenda

Helpers

To perform the tasks, it is necessary to import the libraries used in the script and download the data on which we will be working.

In this script we will be using:

The Colab platform requires a special way of displaying images with OpenCV. If the notebook is run in Colab, execute the following code:
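A minimal helper along these lines could be used (the `imshow` wrapper and its fallback branch are our own sketch; in Colab the official replacement for `cv2.imshow` is `cv2_imshow` from `google.colab.patches`):

```python
import sys

def imshow(img):
    """Display an image with OpenCV, working both inside and outside Colab."""
    if "google.colab" in sys.modules:
        # Colab cannot open GUI windows, so cv2.imshow is unavailable;
        # Colab ships a patched helper that renders into the notebook instead.
        from google.colab.patches import cv2_imshow
        cv2_imshow(img)
    else:
        import cv2
        cv2.imshow("image", img)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
```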

Image in the frequency domain

Previous examples of image processing dealt with the intensity domain. Another domain in which an image can be represented is frequency. The frequency-domain representation should be read as information about which repeating spatial patterns the image contains and how strongly: each position in it corresponds to a spatial frequency, and its value tells how much of that frequency is present in the image.

For example, a very blurry image changes slowly from pixel to pixel, so its content is dominated by low, slowly varying frequencies. A very sharp image, on the other hand, contains far more high-frequency content, because neighboring pixels (for example at edges) differ strongly from each other.

An example of the transition from the intensity domain to the frequency domain is shown below.

Fourier transform

The Fourier transform is given by the following formula:

$${\mathcal{F}}(\omega) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \omega} dx$$

for 2d signal:

$${\mathcal{F}}(u,v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) e^{-2\pi i (ux+vy)} dx dy$$

where $u, v$ are frequencies, $f(x,y)$ is the image on an intensity scale, and $i$ is the imaginary unit ($i^2 = -1$). The above formula can be interpreted as follows:

The transformation from the intensity domain to the frequency domain is performed using the Fourier transform, which is implemented in OpenCV or NumPy. An example of using the NumPy library is presented below.
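A sketch of such a transition; in place of a loaded photograph we use a synthetic sinusoidal image (the pattern and sizes are illustrative assumptions, not the notebook's original data):

```python
import numpy as np

# Synthetic grayscale "image": a horizontal sinusoidal intensity pattern,
# 256 x 256, period of 16 pixels, values roughly in the 0..255 range.
h, w = 256, 256
x = np.arange(w)
row = 127 + 100 * np.sin(2 * np.pi * x / 16)
img = np.tile(row, (h, 1))

# Forward transform: intensity domain -> frequency domain.
f = np.fft.fft2(img)

# Shift the zero-frequency (DC) component to the center of the spectrum,
# so low frequencies sit in the middle and high frequencies at the edges.
fshift = np.fft.fftshift(f)

# Log magnitude is what is usually displayed as "the spectrum".
magnitude = 20 * np.log(np.abs(fshift) + 1)
```

For this image the strongest component is the DC (mean intensity) term, which after the shift lands exactly in the center of the spectrum, flanked by the two peaks of the sine.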

The fft2 function performs the Fourier transform for 2-dimensional signals. Then, using the fftshift function, we shift the data so that the pixels corresponding to the low frequencies end up at the center and the high-frequency pixels at the edges of the image (in the frequency domain).

The inverse transform comes down to undoing the pixel shift (ifftshift) and applying ifft2, which yields an image back in the intensity domain (it is necessary to take the real part with real(), because the result is complex-valued).
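The round trip can be checked on any array; with an assumed random test image, the reconstruction matches the original up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(64, 64))  # stand-in for a real image

# Forward path: transform, then center the low frequencies.
fshift = np.fft.fftshift(np.fft.fft2(img))

# Inverse path: undo the shift, inverse-transform, keep the real part.
back = np.real(np.fft.ifft2(np.fft.ifftshift(fshift)))
```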

Interpretation of the frequency domain

Having the inverse transform (frequency -> intensity), we can run a few experiments to build intuition for the relationship between pixel values in the frequency domain and the resulting intensity image.

Let's define an image (256 x 256) filled with all zeros with high values set at certain positions. These coordinates will be respectively (where $S_h = height / 2 = 128$ and $S_w = width / 2 = 128$):
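The original coordinate list is not shown here; one such experiment can be sketched as follows (the spike offset of 8 is an illustrative choice). Placing a symmetric pair of high values on the horizontal axis of the shifted spectrum produces an intensity image whose values change horizontally (vertical stripes):

```python
import numpy as np

h, w = 256, 256
Sh, Sw = h // 2, w // 2  # S_h = S_w = 128

# Frequency-domain image: all zeros except a symmetric pair of spikes
# on the horizontal axis of the (shifted) spectrum.
freq = np.zeros((h, w), dtype=complex)
freq[Sh, Sw - 8] = 1.0
freq[Sh, Sw + 8] = 1.0

# Inverse path: undo the shift, inverse-transform, keep the real part.
spatial = np.real(np.fft.ifft2(np.fft.ifftshift(freq)))
```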

Similarly, for changing pixel values vertically:
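Analogously, moving the pair of spikes onto the vertical axis (same illustrative offset) yields an intensity pattern that changes vertically (horizontal stripes):

```python
import numpy as np

h, w = 256, 256
Sh, Sw = h // 2, w // 2

# Spikes on the vertical axis of the shifted spectrum.
freq = np.zeros((h, w), dtype=complex)
freq[Sh - 8, Sw] = 1.0
freq[Sh + 8, Sw] = 1.0

spatial = np.real(np.fft.ifft2(np.fft.ifftshift(freq)))
```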

The above experiment provides the following conclusions:

The further from the center the high values are placed, the higher the frequency of the signal that the inverse Fourier transform produces in the intensity-domain image.

Operations in the frequency domain

Image processing in the frequency domain is closely related to the concept of convolution in the intensity domain. According to the convolution theorem for the Fourier transform (link) we can write:

$$(f * g)(t) = {\mathcal{F}}^{-1}\{F \cdot G\}$$

where:

$*$ denotes convolution, $\cdot$ denotes element-wise multiplication, and $F = \mathcal{F}\{f\}$, $G = \mathcal{F}\{g\}$ are the Fourier transforms of $f$ and $g$.

The above theorem says that convolving two functions in the time or space domain is equivalent to element-wise multiplication of their representations in the frequency domain.
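A quick 1-D check of this equivalence in NumPy (the signal and kernel values are arbitrary):

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])   # signal
g = np.array([0.25, 0.5, 0.25])      # kernel

# Direct convolution in the time/space domain ("full" mode).
direct = np.convolve(f, g)           # length len(f) + len(g) - 1 = 6

# Same result via the convolution theorem: zero-pad both signals to the
# full output length, multiply their spectra element-wise, invert.
n = len(f) + len(g) - 1
via_fft = np.real(np.fft.ifft(np.fft.fft(f, n) * np.fft.fft(g, n)))
```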

Earlier transformations between the domains showed that an image of a given size in the intensity domain has the same size in the frequency domain. That size determines the number of representable frequencies, so in order to represent a smaller image with a larger number of frequencies, the image can be padded (as in the case of convolution) with zeros.

When using the OpenCV implementation, you need to pad the image manually, while with the NumPy implementation it is enough to pass the target size as the second parameter (s) of the fft2 function.

Having images in the same domain defined by the same number of frequencies, according to the theory cited, we can perform the equivalent of the convolution operation using simple multiplication in the frequency domain.
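In 2-D the same equivalence holds. Below, a naive direct convolution (our own reference implementation, not from the notebook) is compared against element-wise multiplication of zero-padded spectra:

```python
import numpy as np

def conv2d_full(img, k):
    """Direct 'full' 2-D convolution, as a slow reference implementation."""
    H = img.shape[0] + k.shape[0] - 1
    W = img.shape[1] + k.shape[1] - 1
    out = np.zeros((H, W))
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            # Each input pixel scatters a scaled copy of the kernel.
            out[i:i + k.shape[0], j:j + k.shape[1]] += img[i, j] * k
    return out

rng = np.random.default_rng(1)
img = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))

# Zero-pad both to the full output size via fft2's `s` parameter,
# multiply element-wise, then inverse-transform.
H, W = img.shape[0] + k.shape[0] - 1, img.shape[1] + k.shape[1] - 1
via_fft = np.real(np.fft.ifft2(np.fft.fft2(img, s=(H, W)) *
                               np.fft.fft2(k, s=(H, W))))
```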

Low Pass Filter

Frequency domain filtering can also be divided into low pass and high pass.

Low-pass filtering passes low-frequency signals through and cuts off signals above a certain threshold. As shown earlier, in the shifted spectrum the high-frequency signals lie at the edges of the image, and the closer to the center, the lower the frequency.

The low-pass filtering implemented as the cut-off point is shown below.
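A sketch of such a cut-off (ideal) low-pass filter; the circular mask and the free `radius` parameter are our own choices:

```python
import numpy as np

def low_pass(img, radius):
    """Keep frequencies within `radius` of the spectrum center, zero the rest."""
    h, w = img.shape
    fshift = np.fft.fftshift(np.fft.fft2(img))
    # Circular binary mask centered on the DC component.
    y, x = np.ogrid[:h, :w]
    mask = (y - h // 2) ** 2 + (x - w // 2) ** 2 <= radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(fshift * mask)))
```

With a radius of 0 only the DC component survives, so the output collapses to the image mean; a radius larger than the spectrum leaves the image unchanged.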

The result of low-pass filtering is usually a blur effect, because we discard the areas of high variability (edges and fine detail, but also noise).

High Pass Filter

The filters from the high-pass family work analogously. By passing only high-frequency signals, we extract the places with edges, fine detail, or simply noise.

A high-pass filter implemented as cutting off signals below a certain threshold is presented below.
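The complementary ideal high-pass sketch simply inverts the circular mask (again, the mask shape and `radius` parameter are illustrative choices):

```python
import numpy as np

def high_pass(img, radius):
    """Zero frequencies within `radius` of the spectrum center, keep the rest."""
    h, w = img.shape
    fshift = np.fft.fftshift(np.fft.fft2(img))
    y, x = np.ogrid[:h, :w]
    mask = (y - h // 2) ** 2 + (x - w // 2) ** 2 > radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(fshift * mask)))
```

With a radius of 0 only the DC component is removed, so the output is the image minus its mean; a radius larger than the spectrum zeroes everything.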

Task 1

Check and argue which of the filters listed below is low pass and which is high pass.

Filters:

Answer:

Task 2

Perform edge detection (vertically and horizontally, separately) using Sobel filters applied in the frequency domain, then combine the detected features into one image and present the results. Compare the frequency-domain result with the intensity-domain result.
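One possible frequency-domain implementation of this task (the Sobel kernels are standard; the zero-padding and cropping choices, and the random stand-in image, are ours):

```python
import numpy as np

# Standard Sobel kernels for horizontal and vertical gradients.
sobel_x = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])
sobel_y = sobel_x.T

def filter_fft(img, k):
    """Convolve img with kernel k via spectrum multiplication, crop to input size."""
    H = img.shape[0] + k.shape[0] - 1
    W = img.shape[1] + k.shape[1] - 1
    full = np.real(np.fft.ifft2(np.fft.fft2(img, s=(H, W)) *
                                np.fft.fft2(k, s=(H, W))))
    # Crop the 'same'-sized central region (offset 1 for a 3x3 kernel).
    return full[1:-1, 1:-1]

rng = np.random.default_rng(2)
img = rng.uniform(0, 255, (64, 64))  # stand-in for the real test image

gx = filter_fft(img, sobel_x)
gy = filter_fft(img, sobel_y)

# Combine both directions into a single edge-magnitude image.
edges = np.sqrt(gx ** 2 + gy ** 2)
```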

Answer:

There are no major differences between the frequency-domain and intensity-domain results of the Sobel filter. However, they seem to "highlight" a different range of frequencies: high for the vertical and low for the horizontal filter in the frequency domain, and vice versa in the intensity domain.